RAW SEQUENCE LISTING 



The Biotechnology Systems Branch of the Scientific and Technical
Information Center (STIC) detected no errors.



Application Serial Number: 
Source: 

Date Processed by STIC: 






ENTERED 




RAW SEQUENCE LISTING DATE: 04/25/2006

PATENT APPLICATION: US/10/825,692A TIME: 10:14:42

Input Set : A:\substitute Sequence Listing.txt
Output Set: N:\CRF4\04252006\J825692A.raw

3 <110> APPLICANT: Hotez, Peter
4 Ashcom, James
5 Bdamchian, Mahnaz
6 Zhan, Bin
7 Wang, Yan
8 Hawdon, John
9 Loukas, Alexander
10 Williamson, Angela
11 Jones, Brian
12 Bethony, Jeffrey
13 Goud, Gaddam
14 Botazzi, Maria E.
15 Mendez, Susana

17 <120> TITLE OF INVENTION: Hookworm Vaccine

19 <130> FILE REFERENCE: 03740007aa

21 <140> CURRENT APPLICATION NUMBER: 10/825,692A
22 <141> CURRENT FILING DATE: 2004-04-16

24 <150> PRIOR APPLICATION NUMBER: US 60/329,533
25 <151> PRIOR FILING DATE: 2001-10-17

27 <150> PRIOR APPLICATION NUMBER: US 60/332,007
28 <151> PRIOR FILING DATE: 2001-11-23

30 <150> PRIOR APPLICATION NUMBER: US 60/375,404
31 <151> PRIOR FILING DATE: 2002-04-26

33 <150> PRIOR APPLICATION NUMBER: PCT US02/33106
34 <151> PRIOR FILING DATE: 2002-10-17

36 <160> NUMBER OF SEQ ID NOS: 116

38 <170> SOFTWARE: PatentIn version 3.3

40 <210> SEQ ID NO: 1
41 <211> LENGTH: 1451
42 <212> TYPE: DNA
43 <213> ORGANISM: Necator americanus

45 <400> SEQUENCE: 1
46 atgttttctc ctgtagtcgt cagtgtggta ttcacaatcg ccttctgcaa tgcgtctcca 60

48 gcaagagaca gcttcggctg ctctaacagt gggataactg acagcgaccg gcaagcgttc 120

50 ctcgacttcc acaacaatgc tcgtcgacgg gttgcgaaag gccttgagga tagcaactcc 180

52 ggcaaactga atccagcgaa gaacatgtac aagctgtcat gggactgtgc aatggaacag 240

54 cagcttcagg atgccatcca gtcatgccca agcggctttg ctgggattca aggtgttgcg 300

56 cagaatacaa tgagctggtc aagctctggt ggataccccg atccatcggt aaagatagaa 360

58 ccaacgctct ccggctggtg gagtggtgcg aaaaagaacg gcgtaggccc ggacaacaaa 420

60 tacaccggtg gtggtctctt cgccttctct aacatggtat actccgaaac gacgaaactt 480

62 ggctgcgctt acaaggtttg cggcactaaa ctggcggttt catgcatcta taatggagtc 540

64 gggtacatca caaatcaacc tatgtgggag acaggtcagg cttgccagac aggagcagac 600

66 tgctccactt acaagaactc aggctgcgag gacggccttt gcacgaaggg accagatgta 660

68 ccagaaacaa accagcagtg cccctcaaac accggaatga ctgattcagt cagagatact 720

70 ttcctatcgg tgcacaatga gttcagatcg agtgttgccc gaggtctgga acccgacgct 780

72 ctgggcggaa atgcaccaaa agcagctaaa atgctcaaga tggtgtatga ctgtgaagtg 840

74 gaagcatcgg ccatcagaca tggaaataaa tgcgtctatc aacattctca tggtgaagac 900

76 agacctggac taggagaaaa catctacaaa actagtgtac tcaaattcga caagaacaaa 960

78 gcagccaagc aggcttcaca actctggtgg aatgagttaa aagagtacgg cgtcggccca 1020

80 tccaacgtcc ttaccactgc gttatggaat agacccaaca tgcagattgg tcactacacc 1080

82 cagatggcat gggacaccac ctacaaactt ggatgtgcag ttgttttctg caatgatttc 1140

84 acattcggcg tttgtcagta tgggccagga ggcaattaca tgggtcatgt catctacact 1200

86 atgggccagc cgtgctctca gtgttcgcct ggtgctactt gcagcgtgac cgaaggcttg 1260

88 tgcagcgctc cttaatcagt caacaataaa tatcttacag tgatgttgtt gcttacaaat 1320

90 tgcttctttt ccaatagaaa taccaatgtc aacatcacga gtttctttaa attcatcact 1380

92 tccactacta ggggtgattt gaataaaatt tcatttcata aagcaattac atccgcaaaa 1440

94 aaaaaaaaaa a 1451

97 <210> SEQ ID NO: 2
98 <211> LENGTH: 424
99 <212> TYPE: PRT
100 <213> ORGANISM: Necator americanus

102 <400> SEQUENCE: 2

104 Met Phe Ser Pro Val Val Val Ser Val Val Phe Thr Ile Ala Phe Cys
105 1 5 10 15

108 Asn Ala Ser Pro Ala Arg Asp Ser Phe Gly Cys Ser Asn Ser Gly Ile
109 20 25 30

112 Thr Asp Ser Asp Arg Gln Ala Phe Leu Asp Phe His Asn Asn Ala Arg
113 35 40 45

116 Arg Arg Val Ala Lys Gly Leu Glu Asp Ser Asn Ser Gly Lys Leu Asn
117 50 55 60

120 Pro Ala Lys Asn Met Tyr Lys Leu Ser Trp Asp Cys Ala Met Glu Gln
121 65 70 75 80

124 Gln Leu Gln Asp Ala Ile Gln Ser Cys Pro Ser Gly Phe Ala Gly Ile
125 85 90 95

128 Gln Gly Val Ala Gln Asn Thr Met Ser Trp Ser Ser Ser Gly Gly Tyr
129 100 105 110

132 Pro Asp Pro Ser Val Lys Ile Glu Pro Thr Leu Ser Gly Trp Trp Ser
133 115 120 125

136 Gly Ala Lys Lys Asn Gly Val Gly Pro Asp Asn Lys Tyr Thr Gly Gly
137 130 135 140

140 Gly Leu Phe Ala Phe Ser Asn Met Val Tyr Ser Glu Thr Thr Lys Leu
141 145 150 155 160

144 Gly Cys Ala Tyr Lys Val Cys Gly Thr Lys Leu Ala Val Ser Cys Ile
145 165 170 175

148 Tyr Asn Gly Val Gly Tyr Ile Thr Asn Gln Pro Met Trp Glu Thr Gly
149 180 185 190

152 Gln Ala Cys Gln Thr Gly Ala Asp Cys Ser Thr Tyr Lys Asn Ser Gly
153 195 200 205

156 Cys Glu Asp Gly Leu Cys Thr Lys Gly Pro Asp Val Pro Glu Thr Asn
157 210 215 220

160 Gln Gln Cys Pro Ser Asn Thr Gly Met Thr Asp Ser Val Arg Asp Thr
161 225 230 235 240

164 Phe Leu Ser Val His Asn Glu Phe Arg Ser Ser Val Ala Arg Gly Leu
165 245 250 255

168 Glu Pro Asp Ala Leu Gly Gly Asn Ala Pro Lys Ala Ala Lys Met Leu
169 260 265 270

172 Lys Met Val Tyr Asp Cys Glu Val Glu Ala Ser Ala Ile Arg His Gly
173 275 280 285

176 Asn Lys Cys Val Tyr Gln His Ser His Gly Glu Asp Arg Pro Gly Leu
177 290 295 300

180 Gly Glu Asn Ile Tyr Lys Thr Ser Val Leu Lys Phe Asp Lys Asn Lys
181 305 310 315 320

184 Ala Ala Lys Gln Ala Ser Gln Leu Trp Trp Asn Glu Leu Lys Glu Tyr
185 325 330 335

188 Gly Val Gly Pro Ser Asn Val Leu Thr Thr Ala Leu Trp Asn Arg Pro
189 340 345 350

192 Asn Met Gln Ile Gly His Tyr Thr Gln Met Ala Trp Asp Thr Thr Tyr
193 355 360 365

196 Lys Leu Gly Cys Ala Val Val Phe Cys Asn Asp Phe Thr Phe Gly Val
197 370 375 380

200 Cys Gln Tyr Gly Pro Gly Gly Asn Tyr Met Gly His Val Ile Tyr Thr
201 385 390 395 400

204 Met Gly Gln Pro Cys Ser Gln Cys Ser Pro Gly Ala Thr Cys Ser Val
205 405 410 415

208 Thr Glu Gly Leu Cys Ser Ala Pro
209 420

212 <210> SEQ ID NO: 3 

213 <211> LENGTH: 1893 

214 <212> TYPE: DNA 

215 <213> ORGANISM: Necator americanus 

217 <400> SEQUENCE: 3 

218 ggtactgcag ggtttaatta cccaagtttg agacccaacg ccatgatttg gcgaacgtgg 60 

220 caagttctcg tggttctgta tgcggcgctg tccattacag ttgtgaacgc ctataaacac 120 

222 attagctccg atcacgttgt aaatacaaca ctgggtcaga ttcgaggagt accacagaat 180 

224 ttcgaaggca aaaaagttac cgcttttctt ggtgtgccat atggtcaacc accgactggg 240 

226 gaactacgat tcagcaatcc gaaaatggtg cagcgttggg aaggtataaa gaatgctaca 300 

228 acaccggctc agccatgctt ccacttccct gacagtaaat ttaagggatt tcgtgggtca 360 

230 gagatgtgga atccgaaagg aaatatgacc gaggattgct tgaatatgaa tatctgggtc 420 

232 ccacacgatg ctgatggttc cgtgattgta tggattttcg gaggcggctt cttcaccggt 480 

234 tcaccatctt tagatgttta caacggtact gctctagcag ccaagaaacg taccattgtt 540 

236 gtgaacataa actatcgatt gggtcccttc ggtttccttt atctcggtga tgattctcgt 600 

238 gcacaaggga atatgggact gcaagatcaa caagttgcat tgcgatgggt gcataaacat 660 

240 ataagctcct ttggtggaga tccgagaaaa gtcactcttt tcggcgaagc atcaggcgct 720 

242 gcttcagcaa ccgctcatct agcagcaccg ggaagctatg agtttttcga taagataatt 780 

244 ggcaacggtg gcacaatcat gaatagttgg gccagtcgaa caaatacatc gatgcttgag 840 

246 ctgtcaatga aacttgctga acggttgaac tgtaccaaga aaagaaaaga cccgaatact 900 

248 gtacatcgct gtttggttaa acatccagca catgtggttc taaaagaggc cgctgttgtg 960 

250 tcgtatcaaa ttggtctcgt gctgacgttt gccttcatac ccattacctc tgataagaac 1020 

252 ttcttccagg gaaatgtctt tgatcgtcta cgagataaag acattaagaa gaatgtatcc 1080 

254 attgtgcttg gtactgtaaa agacgaagca accttctttt taccctacta ctttggtcac 1140 

256 aacggtttct ctttcaataa ctcattctta gcagatgggg aagaaaacag agcactcata 1200 

258 aatatatcac agtataatta tgcgatgaat gcaactgcgc catcacttga aagctcactg 1260 

260 gaaccacttt tagaagctta taagaacgtt tcgacgcgaa aagaagaagg tgaaagatta 1320 

262 cgcgatggtg ttggtcgatt catgggcgac tacttctata cctgcagcgt cattgatttc 1380 

264 gctaatatcg tctcagacat tattaatgga tctttgtata tgtattactt tactaagagg 1440 

266 tcagtggcaa atccttggcc agagtggatg ggtgtaatgc atggttatga aatagaatac 1500 

268 gaatttggac agcctttcct aaattcatca ctgtacaagg aaaagcttga aaacgaaaag 1560 

270 atcttctcga aaaatatcat gagcttttgg aaagatttca tcaagactgg tgtccctgtc 1620 

272 gatttttggc cgaaatacga tcgaaaggag cggaaagcgc tcgtacttgg cgaggaaagc 1680 

274 gtgaacaatt cttaccctaa tatgactaat gttcatggac cgtactgtga actgatcgaa 1740 

276 gaagcaaagg cgtctacaaa taatggactc accttgaaga aatacattga aggggagata 1800 

278 aaaaataacg aaacgaacgt attttgatag aatgattttg cacagaatga agaattgaat 1860 

280 atcaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa 1893 

283 <210> SEQ ID NO: 4
284 <211> LENGTH: 594
285 <212> TYPE: PRT
286 <213> ORGANISM: Necator americanus

288 <400> SEQUENCE: 4

290 Met Ile Trp Arg Thr Trp Gln Val Leu Val Val Leu Tyr Ala Ala Leu
291 1 5 10 15

294 Ser Ile Thr Val Val Asn Ala Tyr Lys His Ile Ser Ser Asp His Val
295 20 25 30

298 Val Asn Thr Thr Leu Gly Gln Ile Arg Gly Val Pro Gln Asn Phe Glu
299 35 40 45

302 Gly Lys Lys Val Thr Ala Phe Leu Gly Val Pro Tyr Gly Gln Pro Pro
303 50 55 60

306 Thr Gly Glu Leu Arg Phe Ser Asn Pro Lys Met Val Gln Arg Trp Glu
307 65 70 75 80

310 Gly Ile Lys Asn Ala Thr Thr Pro Ala Gln Pro Cys Phe His Phe Pro
311 85 90 95

314 Asp Ser Lys Phe Lys Gly Phe Arg Gly Ser Glu Met Trp Asn Pro Lys
315 100 105 110

318 Gly Asn Met Thr Glu Asp Cys Leu Asn Met Asn Ile Trp Val Pro His
319 115 120 125

322 Asp Ala Asp Gly Ser Val Ile Val Trp Ile Phe Gly Gly Gly Phe Phe
323 130 135 140

326 Thr Gly Ser Pro Ser Leu Asp Val Tyr Asn Gly Thr Ala Leu Ala Ala
327 145 150 155 160

330 Lys Lys Arg Thr Ile Val Val Asn Ile Asn Tyr Arg Leu Gly Pro Phe
331 165 170 175

334 Gly Phe Leu Tyr Leu Gly Asp Asp Ser Arg Ala Gln Gly Asn Met Gly
335 180 185 190

338 Leu Gln Asp Gln Gln Val Ala Leu Arg Trp Val His Lys His Ile Ser
339 195 200 205

342 Ser Phe Gly Gly Asp Pro Arg Lys Val Thr Leu Phe Gly Glu Ala Ser
343 210 215 220

346 Gly Ala Ala Ser Ala Thr Ala His Leu Ala Ala Pro Gly Ser Tyr Glu
347 225 230 235 240

350 Phe Phe Asp Lys Ile Ile Gly Asn Gly Gly Thr Ile Met Asn Ser Trp
351 245 250 255

354 Ala Ser Arg Thr Asn Thr Ser Met Leu Glu Leu Ser Met Lys Leu Ala
355 260 265 270

358 Glu Arg Leu Asn Cys Thr Lys Lys Arg Lys Asp Pro Asn Thr Val His
359 275 280 285

362 Arg Cys Leu Val Lys His Pro Ala His Val Val Leu Lys Glu Ala Ala
363 290 295 300

366 Val Val Ser Tyr Gln Ile Gly Leu Val Leu Thr Phe Ala Phe Ile Pro
367 305 310 315 320

370 Ile Thr Ser Asp Lys Asn Phe Phe Gln Gly Asn Val Phe Asp Arg Leu
371 325 330 335

374 Arg Asp Lys Asp Ile Lys Lys Asn Val Ser Ile Val Leu Gly Thr Val
375 340 345 350

378 Lys Asp Glu Ala Thr Phe Phe Leu Pro Tyr Tyr Phe Gly His Asn Gly
379 355 360 365

382 Phe Ser Phe Asn Asn Ser Phe Leu Ala Asp Gly Glu Glu Asn Arg Ala
383 370 375 380

386 Leu Ile Asn Ile Ser Gln Tyr Asn Tyr Ala Met Asn Ala Thr Ala Pro
387 385 390 395 400

390 Ser Leu Glu Ser Ser Leu Glu Pro Leu Leu Glu Ala Tyr Lys Asn Val
391 405 410 415

394 Ser Thr Arg Lys Glu Glu Gly Glu Arg Leu Arg Asp Gly Val Gly Arg
395 420 425 430

398 Phe Met Gly Asp Tyr Phe Tyr Thr Cys Ser Val Ile Asp Phe Ala Asn
399 435 440 445

402 Ile Val Ser Asp Ile Ile Asn Gly Ser Leu Tyr Met Tyr Tyr Phe Thr
403 450 455 460

406 Lys Arg Ser Val Ala Asn Pro Trp Pro Glu Trp Met Gly Val Met His
407 465 470 475 480

410 Gly Tyr Glu Ile Glu Tyr Glu Phe Gly Gln Pro Phe Leu Asn Ser Ser
411 485 490 495

414 Leu Tyr Lys Glu Lys Leu Glu Asn Glu Lys Ile Phe Ser Lys Asn Ile
415 500 505 510

418 Met Ser Phe Trp Lys Asp Phe Ile Lys Thr Gly Val Pro Val Asp Phe
419 515 520 525

422 Trp Pro Lys Tyr Asp Arg Lys Glu Arg Lys Ala Leu Val Leu Gly Glu
423 530 535 540

426 Glu Ser Val Asn Asn Ser Tyr Pro Asn Met Thr Asn Val His Gly Pro
427 545 550 555 560

430 Tyr Cys Glu Leu Ile Glu Glu Ala Lys Ala Ser Thr Asn Asn Gly Leu
431 565 570 575

434 Thr Leu Lys Lys Tyr Ile Glu Gly Glu Ile Lys Asn Asn Glu Thr Asn
435 580 585 590

438 Val Phe

442 <210> SEQ ID NO: 5 

443 <211> LENGTH: 1344 

444 <212> TYPE: DNA 

445 <213> ORGANISM: Necator americanus

447 <400> SEQUENCE: 5

448 ctcgtgccga attcggcacg agctccattc atcatgcagc gatcattcct acttctactt 60 
RAW SEQUENCE LISTING ERROR SUMMARY DATE: 04/25/2006

PATENT APPLICATION: US/10/825,692A TIME: 10:14:43

Input Set : A:\substitute Sequence Listing.txt 
Output Set: N:\CRF4\04252006\J825692A.raw 



Please Note: 

Use of n and/or Xaa has been detected in the Sequence Listing. Please review the
Sequence Listing to ensure that a corresponding explanation is presented in the <220>
to <223> fields of each sequence which presents at least one n or Xaa.

Seq#: 51; N Pos. 27, 353, 366, 394, 413
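
The note above can be illustrated with a minimal sketch of how n positions might be located in a nucleotide listing. This is not the STIC verification software; the function names `n_positions` and `needs_feature_note` are hypothetical, and the sketch assumes the input contains only sequence lines (bases plus optional line numbers and residue counts).

```python
def n_positions(raw: str) -> list:
    """Return 1-based positions of 'n' bases in raw sequence text.

    Digits and whitespace (line numbers, residue counts, group gaps)
    are ignored; only letters are counted as bases.
    """
    bases = [ch for ch in raw.lower() if ch.isalpha()]
    return [i for i, base in enumerate(bases, start=1) if base == "n"]


def needs_feature_note(raw: str, has_feature_fields: bool) -> bool:
    """True when the sequence uses 'n' but no <220>-<223> explanation exists.

    has_feature_fields is an assumed flag supplied by the caller.
    """
    return bool(n_positions(raw)) and not has_feature_fields
```

For example, `n_positions("atgcn ttgca 10\nnacgt 15")` reports positions 5 and 11, and `needs_feature_note` on the same text with `has_feature_fields=False` flags the sequence for review.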

Invalid <213> Response:

Use of "Artificial" only as "<213> Organism" response is incomplete,
per 1.823(b) of New Sequence Rules. Valid response is Artificial Sequence.

Seq#: 65, 66, 70, 71, 72, 73, 74, 75, 78, 79, 80, 81, 115, 116
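
The <213> rule stated above can be sketched as a simple check. This is an illustration only, under the assumption that each <213> response is available as a plain string; `check_organism` is a hypothetical name and does not reflect how the STIC checker is implemented.

```python
from typing import Optional


def check_organism(response: str) -> Optional[str]:
    """Return a warning when a <213> response is the bare word 'Artificial'.

    Any other response (for example 'Artificial Sequence' or a species
    name) passes this particular check and yields None.
    """
    if response.strip().lower() == "artificial":
        return 'Invalid <213> response: use "Artificial Sequence"'
    return None
```

Calling `check_organism("Artificial")` returns a warning string, while `check_organism("Artificial Sequence")` and `check_organism("Necator americanus")` return `None`.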
VERIFICATION SUMMARY DATE: 04/25/2006

PATENT APPLICATION: US/10/825,692A TIME: 10:14:43

Input Set : A:\substitute Sequence Listing.txt
Output Set: N:\CRF4\04252006\J825692A.raw

L:4146 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:51 after pos.:0

M:341 Repeated in SeqNo=51